Add check for negative stripe index in ORC reader #10074
CUDF_EXPECTS(
  stripe_idx >= 0 and stripe_idx < static_cast<decltype(stripe_idx)>(
    per_file_metadata[src_file_idx].ff.stripes.size()),
  "Invalid stripe index");
I suggest that we could cast the auto=cudf::size_type to match the size_t that the size() function is returning, rather than play type gymnastics by casting the result of size() (requiring a decltype()) and also checking >= 0? The end result appears to be shorter and safer. (Excuse the lack of clang-format.)
- CUDF_EXPECTS(
-   stripe_idx >= 0 and stripe_idx < static_cast<decltype(stripe_idx)>(
-     per_file_metadata[src_file_idx].ff.stripes.size()),
-   "Invalid stripe index");
+ CUDF_EXPECTS(static_cast<std::size_t>(stripe_idx) < per_file_metadata[src_file_idx].ff.stripes.size(), "Invalid stripe index");
This doesn't touch the loop, so I don't think it will break the necessary change for GCC 11 in #10045.
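For readers comparing the two forms, here is a minimal sketch of both checks side by side. It is not libcudf code: `size_type` is a hypothetical stand-in for `cudf::size_type`, the function names are made up for illustration, and the stripe count is passed in directly instead of being read from the file footer.

```cpp
#include <cstddef>

// Hypothetical stand-in for cudf::size_type (a signed 32-bit integer).
using size_type = int;

// Explicit two-sided check, in the style of the PR.
// Assumes num_stripes fits in size_type, so casting the size is lossless.
inline bool is_valid_explicit(size_type stripe_idx, std::size_t num_stripes)
{
  return stripe_idx >= 0 && stripe_idx < static_cast<size_type>(num_stripes);
}

// Single comparison, in the style of the suggestion.
// Converting a negative value to std::size_t is well defined (modular
// arithmetic) and yields a very large number, so a negative index fails
// the upper-bound comparison.
inline bool is_valid_cast(size_type stripe_idx, std::size_t num_stripes)
{
  return static_cast<std::size_t>(stripe_idx) < num_stripes;
}
```

With three stripes, both functions accept indices 0 through 2 and reject -1 and 3. The difference is only in how the negative case is rejected: explicitly in the first form, through unsigned wraparound in the second.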
But then we would be static casting to size_t a value that can be negative. IMO this option works "by accident"; the explicit check here is preferable to me.
Can a stripe index be negative? A size_type==int can be negative but that doesn't mean that stripe_idx should ever take on negative values (unless I'm unaware of how it's used). In the same way, we would need to know in the current snippet that the size() call returning a size_t won't overflow a decltype(stripe_idx)==size_type when being cast. I would argue that the safety/correctness of either option is conditional on prior knowledge of the values, and is not an intrinsic guarantee of the choice of type/casting.
Stripe index is a value passed by the user, so we need to check that it's in the valid range. It is impossible for the size to exceed the max size_type, since the number of stripes is always (way) smaller than the number of rows.
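A sketch of what this input validation could look like as a standalone helper, with the assumption about the stripe count made explicit instead of implied. The helper name and `size_type` alias are hypothetical, and standard exceptions stand in for `CUDF_EXPECTS`; this is not the reader's actual code.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in for cudf::size_type (a signed 32-bit integer).
using size_type = int;

// Validates a user-supplied stripe index against the number of stripes.
inline void validate_stripe_index(size_type stripe_idx, std::size_t num_stripes)
{
  // States the assumption from the discussion: the stripe count fits in
  // size_type, so the cast below cannot change its value.
  if (num_stripes > static_cast<std::size_t>(std::numeric_limits<size_type>::max())) {
    throw std::length_error("Stripe count exceeds size_type range");
  }
  // Explicit range check on the user-supplied value.
  if (stripe_idx < 0 || stripe_idx >= static_cast<size_type>(num_stripes)) {
    throw std::invalid_argument("Invalid stripe index");
  }
}
```

In this sketch, calling `validate_stripe_index(-1, 4)` or `validate_stripe_index(4, 4)` throws `std::invalid_argument`, while indices 0 through 3 pass silently.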
"Stripe index is a value passed by the user"
Okay, that's what I was missing. I assumed it was a loop index or something where we had stronger guarantees about its potential values.
Yeah, this check is just input validation, which is what the failing test is validating.
See comment.
Please feel free to merge this PR if it's fixing CI. I don't want to hold up CI for this suggestion so close to code freeze, but let's discuss my suggestion further in the comment thread. edit: LGTM
Codecov Report
@@               Coverage Diff                @@
##           branch-22.02   #10074      +/-   ##
================================================
- Coverage         10.49%   10.41%    -0.08%
================================================
  Files               119      119
  Lines             20305    20541     +236
================================================
+ Hits               2130     2139       +9
- Misses            18175    18402     +227
@gpucibot merge
Fixes CI failure